{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Explaining a simple OR function\n", "\n", "This notebook shows what SHAP values look like when explaining an OR function.\n", "Through this example, we see how changing the background distribution affects the explanations you obtain from a [TreeExplainer][tree_doclink].\n", "\n", "[tree_doclink]: ../../../generated/shap.TreeExplainer.rst#shap.TreeExplainer\n", "\n", "It is based on a simple example with two features, `is_young` and `is_female`, loosely motivated by the Titanic survival dataset, where women and children were given priority during the evacuation and so were more likely to survive. This simulated example takes that effect to the extreme: all children and women survive and no adult men survive." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import xgboost\n", "from IPython.display import display\n", "\n", "import shap\n", "\n", "rng = np.random.default_rng(42)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a dataset following an OR function" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | is_young | is_female |
|---|---|---|
| 0 | 1 | 0 |
| 1 | 1 | 1 |
| 2 | 0 | 0 |